Speech enhancement based on spectral estimation from higher-lag autocorrelation
Abstract
In this paper, we propose a unique approach to enhancing speech signals that have been corrupted by non-stationary noise. The approach is based not on spectral subtraction but on an algorithm that separates the speech and noise contributions in the autocorrelation domain. We call this technique the AR-HASE speech enhancement algorithm. In this initial study, we evaluate the new algorithm using, as a measure of speech quality, the average PESQ score computed over 10 male and 10 female utterances taken from the TIMIT database. We test the algorithm with one broadband stationary noise and two non-stationary noises. We show that the AR-HASE algorithm produces near-transparent quality for clean speech, gives poor enhancement performance for the broadband stationary noise, and significantly improves quality for the two non-stationary noises.
Similar resources
Feature extraction from higher-lag autocorrelation coefficients for robust speech recognition
In this paper, a feature extraction method that is robust to additive background noise is proposed for automatic speech recognition. Since the background noise corrupts the autocorrelation coefficients of the speech signal mostly at the lower time lags, while the higher-lag autocorrelation coefficients are least affected, this method discards the lower-lag autocorrelation coefficients and uses o...
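The idea of discarding the lower-lag autocorrelation coefficients can be illustrated with a minimal Python sketch. It is not the method of any paper listed here. The function name `higher_lag_spectrum`, the `min_lag` parameter, the biased autocorrelation estimate, and the Hamming taper are all assumptions chosen for illustration, as are the values in the demo.

```python
import numpy as np

def higher_lag_spectrum(frame, min_lag):
    """Magnitude spectrum estimated from autocorrelation lags >= min_lag only.

    Hypothetical helper for illustration; not taken from the papers listed here.
    """
    n = len(frame)
    # Biased one-sided autocorrelation for lags 0..n-1.
    full = np.correlate(frame, frame, mode="full")
    r = full[n - 1:] / n
    # Taper the retained lags and zero the lower ones, where additive
    # broadband noise contributes most. The Hamming taper is an assumption.
    window = np.zeros(n)
    window[min_lag:] = np.hamming(n - min_lag)
    r_high = r * window
    # Magnitude spectrum of the one-sided, higher-lag sequence.
    return np.abs(np.fft.rfft(r_high))

# Demo with assumed values: a 500 Hz tone in white noise, sampled at 8 kHz.
rng = np.random.default_rng(0)
fs, n, f0 = 8000, 256, 500.0
t = np.arange(n) / fs
noisy = np.sin(2 * np.pi * f0 * t) + 0.5 * rng.standard_normal(n)

spec = higher_lag_spectrum(noisy, min_lag=16)  # 16 samples = 2 ms at 8 kHz
peak_hz = np.argmax(spec) * fs / n
print(peak_hz)
```

White noise contributes mainly near lag zero of the autocorrelation, whereas a periodic component keeps its structure at higher lags, so the spectral peak of the tone should survive the removal of the lower lags.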
On the Use of Asymmetric Windows for Robust Speech Recognition
This paper deals with the problem of searching for a suitable window for robust speech recognition in noisy conditions. A set of asymmetric windows, so-called DDRc,w, is proposed; these windows are controlled by two parameters, center c and width w. They are derived from the DDR window used in the higher-lag autocorrelation spectrum estimation (HASE) method and act over the OSA (One-Sided Autoco...
Applications of Surface Correlation to the Estimation of the Harmonic Fundamental of Speech
We present a method for estimating the fundamental frequency of harmonic signals, and apply this method to human speech. The method is based on cross-spectral methods, which provide accurate resolution of multicomponent FM signals in both time and frequency. The fundamental is re-introduced to the spectrum by a frequency-lag autocorrelation of the spectrum, even if the fundamental is completely...
Voice Activity Detection using Temporal Characteristics of Autocorrelation Lag and Maximum Spectral Amplitude in Sub-bands
Robust voice activity detection (VAD) is a prerequisite for many speech-based applications such as speech recognition. We investigated two VAD techniques that use time-domain and frequency-domain characteristics of the speech signal. The temporal characteristic of the autocorrelation lag is able to discriminate between speech and non-speech regions. In the frequency domain, peak value of the magnitude spectr...
Comparison of Different Order Cumulants in a Speech Enhancement System by Adaptive Wiener Filtering
We study some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim-Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. In our algorithms, however, we consider an AR estimation by means of a cumulant (third- and fourth-order) analysis. This work extends some preceding papers due to the authors, providing a behavi...